Large Neighborhood Local Search for the 
Maximum Set Packing Problem 

Maxim Sviridenko* and Justin Ward^ 
Department of Computer Science, University of Warwick 

Abstract 

In this paper we consider the classical maximum set packing problem
where set cardinality is upper bounded by $k$. We show how to design
a variant of a polynomial-time local search algorithm with performance
guarantee $(k + 2)/3$. This local search algorithm is a special case of a
more general procedure that allows swapping up to $O(\log n)$ elements per
iteration. We also design problem instances with locality gap $k/3$ even for
a wide class of exponential time local search procedures, which can swap
up to $cn$ elements for a constant $c$. This shows that our analysis of this
class of algorithms is almost tight.

1 Introduction 

In this paper, we consider the problem of maximum unweighted $k$-set packing.
In this problem, we are given a collection $\mathcal{N}$ of $n$ distinct
$k$-element subsets of some ground set $X$. We say that two sets
$A, B \in \mathcal{N}$ conflict if they share an element and call a collection
of mutually non-conflicting sets from $\mathcal{N}$ a packing. Then, the goal
of the unweighted $k$-set packing problem is to find a packing
$\mathcal{A} \subseteq \mathcal{N}$ of maximum cardinality. Here, we assume
that each set has cardinality exactly $k$. This assumption is without loss of
generality, since we can always add unique elements to each set of cardinality
less than $k$ to obtain such an instance.

The maximum set packing problem is one of the basic optimization problems.
It has received a significant amount of attention from researchers in the last
few decades (see e.g. [5]). It is known that a simple local search algorithm
that starts with an arbitrary feasible solution and tries to add a constant
number of sets to the current solution while removing a constant number of
conflicting sets has performance guarantee arbitrarily close to $k/2$ [7]. It
was also shown in [7] that the analysis of such an algorithm is tight, i.e.
there are maximum set packing instances where the ratio between a locally
optimal solution value and the globally optimal solution value is arbitrarily
close to $k/2$.

Surprisingly, Halldórsson [5] showed that if one increases the size of
allowable swaps to $O(\log n)$, the performance guarantee can be shown to be at
most $(k + 2)/3$. Recently, Cygan, Grandoni and Mastrolilli [3] improved the
guarantee for the same algorithm to $(k + 1)/3$. This performance guarantee is
the best currently known for the maximum set packing problem. The obvious
drawback of these algorithms is that they run in time $n^{O(\log n)}$, and
therefore their running time is not polynomial.

Both algorithms rely only on a subset of the swaps of size $O(\log n)$ to
prove their respective performance guarantees. Halldórsson's swaps are
particularly well structured and have a straightforward interpretation in
graph theoretic language. In Section 4 we employ techniques from
fixed-parameter tractability to yield a procedure for finding well-structured
improvements of size $O(\log n)$ in polynomial time. Our algorithm is based on
the color coding technique introduced by Alon, Yuster, and Zwick [1] and its
extension by Fellows et al. [4], and solves a dynamic program to locate an
improvement if one exists. Combining this with Halldórsson's analysis, we
obtain a polynomial time $\frac{k+2}{3}$-approximation algorithm. In Section 6
we show that it is not possible to improve this result beyond $\frac{k}{3}$,
even by choosing significantly larger improvements. Specifically, we construct
a family of instances in which the locality gap for a local search algorithm
applying all improvements of size $t$ remains at least $\frac{k}{3}$ even when
$t$ is allowed to grow linearly with $n$. Our lower bound thus holds even for
local search algorithms that are allowed to examine some exponential number of
possible improvements at each stage.

2 A Quasi-Polynomial Time Local Search Algorithm

Let $\mathcal{A}$ be a packing. We define an auxiliary multigraph
$G_{\mathcal{A}}$ whose vertices correspond to sets in $\mathcal{A}$ and whose
edges correspond to sets in $\mathcal{N} \setminus \mathcal{A}$ that conflict
with at most 2 sets in $\mathcal{A}$. That is, $E(G_{\mathcal{A}})$ contains a
separate edge $(S, T)$ for each set $X \in \mathcal{N} \setminus \mathcal{A}$
that conflicts with exactly two sets $S$ and $T$ in $\mathcal{A}$, and a loop
on $S$ for each set $X \in \mathcal{N}$ that conflicts with exactly one set $S$
in $\mathcal{A}$. In order to simplify our analysis, we additionally say that
each set $X \in \mathcal{A}$ conflicts with itself, and place such a loop on
each set of $\mathcal{A}$. Note that $G_{\mathcal{A}}$ contains $O(n)$ vertices
and $O(n)$ edges, for any value of $\mathcal{A}$.
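
The construction of the auxiliary multigraph can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `build_aux_graph`, the use of frozensets for the $k$-sets, and the edge triples `(i, j, X)` indexed by position in the packing are all hypothetical choices, not notation from the paper.

```python
def build_aux_graph(packing, all_sets):
    """Return the edges of the auxiliary multigraph as (i, j, X) triples.

    Vertices are indices into `packing`. A triple with i == j is a loop.
    Representation and names are illustrative assumptions.
    """
    packing = [frozenset(s) for s in packing]
    # Sets of a packing are disjoint, so each element has at most one owner.
    owner = {e: idx for idx, s in enumerate(packing) for e in s}

    # Convention from the text: every set of the packing conflicts with
    # itself, so it contributes a loop on its own vertex.
    edges = [(idx, idx, s) for idx, s in enumerate(packing)]

    in_packing = set(packing)
    for x in all_sets:
        x = frozenset(x)
        if x in in_packing:
            continue
        hits = sorted({owner[e] for e in x if e in owner})
        if len(hits) == 1:        # conflicts with exactly one set: a loop
            edges.append((hits[0], hits[0], x))
        elif len(hits) == 2:      # conflicts with exactly two sets: an edge
            edges.append((hits[0], hits[1], x))
        # Sets with 0 or with 3 or more conflicts contribute no edge.
    return edges


packing = [{1, 2}, {3, 4}]
all_sets = packing + [{2, 3}, {1, 5}, {5, 6}]
edges = build_aux_graph(packing, all_sets)
```

In this toy instance with $k = 2$, the set {2, 3} becomes an edge between the two vertices, {1, 5} becomes a loop, and {5, 6} contributes nothing because it conflicts with no set of the packing.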

Our local search algorithm uses $G_{\mathcal{A}}$ to search for improvements
to the current solution $\mathcal{A}$. Formally, we call a set $I$ of $t$ edges
in $G_{\mathcal{A}}$ a $t$-improvement if $I$ covers at most $t - 1$ vertices
of $G_{\mathcal{A}}$ and the sets of $\mathcal{N} \setminus \mathcal{A}$
corresponding to the edges in $I$ are mutually disjoint.

Note that if $I$ is a $t$-improvement for a packing $\mathcal{A}$, we can
obtain a larger packing by removing the at most $t - 1$ sets covered by $I$
from $\mathcal{A}$ and then adding the $t$ sets corresponding to the edges of
$I$ to the result. We limit our search for improvements in $G_{\mathcal{A}}$
to those that exhibit the following particular form: an improvement is a
canonical improvement if it forms a connected graph containing two distinct
cycles. A general canonical improvement then comprises either 2 edge-disjoint
cycles joined by a path, two edge-disjoint cycles that share a single vertex,
or two distinct vertices joined by 3 edge-disjoint paths (see Figure 1).

Figure 1: Canonical Improvements (panels (a), (b), and (c))

Our algorithm, shown in Figure 2, proceeds by repeatedly calling the
procedure Improve($G_{\mathcal{A}}$), which searches for a canonical
$(4\log n + 1)$-improvement in the graph $G_{\mathcal{A}}$. Before searching
for a canonical improvement, we first ensure that $\mathcal{A}$ is a maximal
packing by greedily adding sets from $\mathcal{N} \setminus \mathcal{A}$ to
$\mathcal{A}$. If Improve($G_{\mathcal{A}}$) returns an improvement $I$, then
$I$ is applied to the current solution and the search continues. Otherwise,
the current solution $\mathcal{A}$ is returned.

In Section 3 we analyze the approximation performance of the local search
algorithm under the assumption that Improve($G_{\mathcal{A}}$) always finds a
canonical $(4\log n + 1)$-improvement, whenever such an improvement exists. In
Section 4 we provide such an implementation of Improve($G_{\mathcal{A}}$) that
runs in deterministic polynomial time.

1 It can be shown that every $t$-improvement must contain a canonical
$t$-improvement, and so we are not in fact restricting the search space at all
by considering only canonical improvements. However, this fact will not be
necessary for our analysis.

$\mathcal{A} \leftarrow \emptyset$
loop
    for all $S \in \mathcal{N} \setminus \mathcal{A}$ do
        if $S$ does not conflict with any set of $\mathcal{A}$ then
            $\mathcal{A} \leftarrow \mathcal{A} \cup \{S\}$
        end if
    end for
    Construct the auxiliary graph $G_{\mathcal{A}}$ for $\mathcal{A}$
    $I \leftarrow$ Improve($G_{\mathcal{A}}$)
    if $I = \emptyset$ then
        return $\mathcal{A}$
    else
        $\mathcal{A} \leftarrow (\mathcal{A} \setminus V(I)) \cup E(I)$
    end if
end loop

Figure 2: The General Local Search Procedure
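
The loop of Figure 2 can be sketched in Python as below. This is a hedged illustration, not the implementation analyzed in the paper: `local_search` takes the improvement oracle as a callback, and the stand-in `brute_force_improve` (a hypothetical helper) simply tries all small collections of sets rather than searching for canonical improvements in the auxiliary graph.

```python
from itertools import combinations


def brute_force_improve(packing, all_sets, t=2):
    """Hypothetical stand-in for Improve: try every collection of at most t
    mutually disjoint sets outside the packing and return (remove, add) if
    adding it displaces fewer sets than it adds. Exponential in t."""
    current = set(packing)
    outside = [s for s in all_sets if s not in current]
    for size in range(1, t + 1):
        for add in combinations(outside, size):
            if any(a & b for a, b in combinations(add, 2)):
                continue  # the added sets must be mutually disjoint
            remove = {s for s in packing if any(s & a for a in add)}
            if len(remove) < size:
                return remove, add
    return None


def local_search(all_sets, improve):
    """General local search loop: greedily make the packing maximal, ask the
    oracle for an improvement, apply it, and repeat until none is found."""
    all_sets = [frozenset(s) for s in all_sets]
    packing = []
    while True:
        used = set().union(*packing)
        for s in all_sets:
            if not (s & used):        # s conflicts with no set of the packing
                packing.append(s)
                used |= s
        result = improve(packing, all_sets)
        if result is None:
            return packing
        remove, add = result
        packing = [s for s in packing if s not in remove] + list(add)


result = local_search([{2, 3}, {1, 2}, {3, 4}], brute_force_improve)
```

On this small instance the greedy step first selects {2, 3}, after which the oracle swaps it for the two disjoint sets {1, 2} and {3, 4}. Each applied improvement strictly increases the size of the packing, so the loop terminates.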

3 Locality Gap of the Algorithm 

In this section we prove the following upper bound on the locality gap for our
algorithm. We consider an arbitrary instance $\mathcal{N}$ of $k$-set packing,
and let $\mathcal{A}$ be the packing in $\mathcal{N}$ produced by our local
search algorithm and $\mathcal{B}$ be any other packing in $\mathcal{N}$.

Theorem 3.1. $|\mathcal{B}| \le \frac{k+2}{3}|\mathcal{A}|$.

For the purpose of our analysis, we consider the subgraph
$H_{\mathcal{A},\mathcal{B}}$ of $G_{\mathcal{A}}$ consisting of only those
edges of $G_{\mathcal{A}}$ corresponding to sets in $\mathcal{B}$. Then, every
collection of edges in $H_{\mathcal{A},\mathcal{B}}$ is also present in
$G_{\mathcal{A}}$. Moreover, because the edges of
$H_{\mathcal{A},\mathcal{B}}$ all belong to the packing $\mathcal{B}$, any
subset of them must be mutually disjoint. Thus, we can assume that no
collection of at most $4\log n + 1$ edges from $H_{\mathcal{A},\mathcal{B}}$
forms any of the structures shown in Figure 1. Otherwise, the corresponding
collection of edges in $G_{\mathcal{A}}$ would form a canonical
$(4\log n + 1)$-improvement.

In order to prove Theorem 3.1 we make use of the following lemma of
Berman and Fürer [2], which gives conditions under which the multigraph
$H_{\mathcal{A},\mathcal{B}}$ must contain a canonical improvement (see
footnote 2). We provide Berman and Fürer's proof in the appendix.

Lemma 3.2 (Lemma 3.2 in [2]). Assume that $|E| \ge \frac{p+1}{p}|V|$ in a
multigraph $H = (V, E)$. Then, $H$ contains a canonical improvement with at
most $4p\log n - 1$ vertices.

2 Berman and Fürer call structures of the form shown in Figure 1 "binoculars."
Here, we have rephrased their lemma in our own terminology.

It will also be necessary to bound the total number of loops in
$H_{\mathcal{A},\mathcal{B}}$. In order to do this, we shall consider a second
auxiliary graph $H'_{\mathcal{A},\mathcal{B}}$ that is obtained from
$H_{\mathcal{A},\mathcal{B}}$ in the following fashion:

Lemma 3.3. Let $H = (V, E)$ be a multigraph and let $H' = (V', E')$ be
obtained from $H$ by deleting all vertices of $H$ with loops on them and, for
each edge with one endpoint incident to a deleted vertex, introducing a new
loop on this edge's remaining vertex. Let $t \ge 3$. Then, if $H'$ contains a
canonical $t$-improvement, $H$ contains a canonical $(t + 2)$-improvement.

A proof of Lemma 3.3, based on a sketch given by Halldórsson [5], appears
in the appendix.

We now turn to the proof of Theorem 3.1. Every set in $\mathcal{B}$ must
conflict with some set in $\mathcal{A}$, or else $\mathcal{A}$ would not be
maximal. We partition the sets of $\mathcal{B}$ into three collections of
sets, depending on how many sets in $\mathcal{A}$ they conflict with. Let
$\mathcal{B}_1$, $\mathcal{B}_2$, and $\mathcal{B}_3$ be the collections of
those sets of $\mathcal{B}$ that conflict with, respectively, exactly 1,
exactly 2, and 3 or more sets in $\mathcal{A}$ (note that each set of
$\mathcal{A} \cap \mathcal{B}$ is counted in $\mathcal{B}_1$, since we have
adopted the convention that such sets conflict with themselves).

Because each set in A contains at most k elements and the sets in B are 
mutually disjoint, we have the inequality 

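\[ |\mathcal{B}_1| + 2|\mathcal{B}_2| + 3|\mathcal{B}_3| \le k|\mathcal{A}|. \tag{1} \]

We now bound the size of $\mathcal{B}_1$ and $\mathcal{B}_2$.

Let $\mathcal{A}_1$ be the collection of sets from $\mathcal{A}$ that conflict
with sets of $\mathcal{B}_1$. Then, note that each set of $\mathcal{B}_1$
corresponds to a loop in $H_{\mathcal{A},\mathcal{B}}$ and the sets of
$\mathcal{A}_1$ correspond to the vertices on which these loops occur. Any
vertex of $H_{\mathcal{A},\mathcal{B}}$ with two loops would form an
improvement of the form shown in Figure 1a. Thus, each vertex in
$H_{\mathcal{A},\mathcal{B}}$ has at most 1 loop and hence:

\[ |\mathcal{B}_1| = |\mathcal{A}_1|. \tag{2} \]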

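Now, we show that $|\mathcal{B}_2| < 2|\mathcal{A} \setminus \mathcal{A}_1|$.
By way of contradiction, suppose that
$|\mathcal{B}_2| \ge 2|\mathcal{A} \setminus \mathcal{A}_1|$. We construct an
auxiliary graph $H'_{\mathcal{A},\mathcal{B}}$ from
$H_{\mathcal{A},\mathcal{B}}$ as in Lemma 3.3. The number of edges in this
graph is exactly $|\mathcal{B}_2|$ and the number of vertices is exactly
$|\mathcal{A} \setminus \mathcal{A}_1|$. Thus, if
$|\mathcal{B}_2| \ge 2|\mathcal{A} \setminus \mathcal{A}_1|$, then from Lemma
3.2 (with $p = 1$), there is a canonical improvement in
$H'_{\mathcal{A},\mathcal{B}}$ of size at most $4\log n - 1$. But, from Lemma
3.3 this means there must be a canonical improvement in
$H_{\mathcal{A},\mathcal{B}}$ of size at most $4\log n + 1$, contradicting the
local optimality of $\mathcal{A}$. Thus,

\[ |\mathcal{B}_2| < 2|\mathcal{A} \setminus \mathcal{A}_1|. \tag{3} \]

Adding (1), twice (2), and (3), we obtain

\[ 3|\mathcal{B}_1| + 3|\mathcal{B}_2| + 3|\mathcal{B}_3| \le k|\mathcal{A}| + 2|\mathcal{A}_1| + 2|\mathcal{A} \setminus \mathcal{A}_1|, \]

which implies that $3|\mathcal{B}| \le (k + 2)|\mathcal{A}|$.

4 Finding Canonical Improvements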



A naive implementation of the local search algorithm described in Section 2
would run in only quasi-polynomial time, since at each step there are
$n^{\Omega(\log n)}$ possible improvements of size $t = 4\log n + 1$. In
contrast, we now show that it is possible to find a canonical improvement of
size $t$ in polynomial time whenever one exists.

We first give a randomized algorithm, using the color coding approach of
Alon, Yuster, and Zwick [1]. If some $t$-improvement exists, our algorithm
finds it with polynomially small probability. In Section 5 we show how to use
this algorithm to implement a local search algorithm that succeeds with high
probability, and how to obtain a deterministic variant via derandomization.

We now describe the basic, randomized color coding algorithm. Again, consider
an arbitrary instance $\mathcal{N}$ of $k$-set packing and let $X$ be the
ground set of $\mathcal{N}$. Let $K$ be a collection of $kt$ colors. We assign
each element of $X$ a color from $K$ uniformly at random, and assign each
$k$-set from $\mathcal{N}$ the set of all its elements' colors. We say that a
collection of sets $\mathcal{A} \subseteq \mathcal{N}$ is colorful if no color
appears twice amongst the sets of $\mathcal{A}$. We note that if a collection
of sets $\mathcal{A}$ is colorful, then $\mathcal{A}$ must form a packing,
since no two sets in $\mathcal{A}$ can share an element.
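
A minimal Python sketch of the random coloring and the colorfulness test follows. The names `random_coloring` and `is_colorful` and the dictionary representation of a coloring are assumptions made for illustration. The sketch also treats a set whose own elements repeat a color as not colorful, which matches the later requirement that each edge carry $k$ distinct colors.

```python
import random


def random_coloring(ground_set, num_colors, rng):
    """Assign each ground-set element a color in range(num_colors),
    independently and uniformly at random."""
    return {x: rng.randrange(num_colors) for x in ground_set}


def is_colorful(sets, coloring):
    """Return True if no color appears twice among the elements of `sets`.

    A shared element would contribute its color twice, so a colorful
    collection is automatically a packing."""
    seen = set()
    for s in sets:
        for x in s:
            color = coloring[x]
            if color in seen:
                return False
            seen.add(color)
    return True


coloring = {1: 0, 2: 1, 3: 2, 4: 3}
disjoint_ok = is_colorful([{1, 2}, {3, 4}], coloring)
overlapping = is_colorful([{1, 2}, {2, 3}], coloring)
sampled = random_coloring(range(5), 3, random.Random(0))
```

Here `disjoint_ok` is true because the four elements carry four distinct colors, while `overlapping` is false because element 2 occurs in both sets.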

We assign each edge of $G_{\mathcal{A}}$ the same set of colors as its
corresponding set in $\mathcal{N}$, and, similarly, say that a collection of
edges is colorful if the corresponding collection of sets from $\mathcal{N}$
is colorful. Now, we consider a subgraph of $G_{\mathcal{A}}$ made up of some
set of at most $t$ edges $I$. If this graph has one of the forms shown in
Figure 1 and $I$ is colorful, then $I$ must be a canonical $t$-improvement. We
now show that, although the converse does not hold, our random coloring makes
any given canonical $t$-improvement colorful with probability only
polynomially small in $n$.

Consider a canonical improvement $I$ of size $1 \le i \le t$. The $i$ sets
corresponding to the edges of $I$ must be disjoint and so consist of $ki$
separate elements. The probability that $I$ is colorful is precisely the
probability that all of these $ki$ elements are assigned distinct colors. This
probability can be estimated as

\[ \frac{\binom{kt}{ki}(ki)!}{(kt)^{ki}} = \frac{(kt)!}{(kt - ki)!\,(kt)^{ki}} \ge \frac{(kt)!}{(kt)^{kt}} > e^{-kt} = e^{-4k\log n - k} = e^{-k} n^{-4k\log e} > e^{-k} n^{-8k}, \tag{4} \]

where in the last line, we have used the fact that
$e^{\log n} = e^{\ln n \cdot \log e} = n^{\log e} < n^2$.
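
The chain of estimates above can be checked numerically for small parameters. The following Python sketch is only a sanity check under assumed values ($k = 3$, $t = 5$, and $n = 16$ with base-2 logarithms); the function name `colorful_probability` is hypothetical.

```python
from math import comb, exp, factorial, log2


def colorful_probability(k, t, i):
    """Exact probability that k*i fixed elements receive distinct colors
    when each is colored uniformly at random from k*t colors."""
    return comb(k * t, k * i) * factorial(k * i) / (k * t) ** (k * i)


k, t = 3, 5
probabilities = [colorful_probability(k, t, i) for i in range(1, t + 1)]
lower_bound = exp(-k * t)  # the e^{-kt} bound from the estimate

# With t = 4 log n + 1 (logarithm base 2), e^{-kt} exceeds e^{-k} n^{-8k}.
n = 16
t_n = 4 * int(log2(n)) + 1
first = exp(-k * t_n)
second = exp(-k) * n ** (-8 * k)
```

The probability decreases as $i$ grows, so the case $i = t$ is the binding one; it equals $(kt)!/(kt)^{kt}$.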

We now show how to use this random coloring to find canonical improvements in
$G_{\mathcal{A}}$. Our approach is based on finding colorful paths and cycles
and employs dynamic programming.

We give a dynamic program that, given a coloring for the edges of
$G_{\mathcal{A}}$, as described above, finds a colorful path of length at most
$t$ in $G_{\mathcal{A}}$ between each pair of vertices $S$ and $T$, if such a
path exists. For each pair of vertices $S$ and $T$ of $G_{\mathcal{A}}$, each
value $i \le t$, and each set $C$ of $ki$ colors from $K$, we have an entry
$\mathcal{D}(S, T, i, C)$ that records whether or not there is some colorful
path of length $i$ between $S$ and $T$ whose edges are colored with precisely
those colors in $C$. In our table, we explicitly include the case that
$S = T$.

We compute the entries of $\mathcal{D}$ bottom-up in the following fashion. We
set $\mathcal{D}(S, T, 0, C) = 0$ for all pairs of vertices $S$, $T$ and all
$C$, except that the empty path gives $\mathcal{D}(S, S, 0, \emptyset) = 1$,
which the recurrence needs as its base case. Then, we compute the entries
$\mathcal{D}(S, T, i, C)$ for $i > 0$ as follows. We set
$\mathcal{D}(S, T, i, C) = 1$ if there is some edge $(V, T)$ incident to
vertex $T$ in $G_{\mathcal{A}}$ such that $(V, T)$ is colored with a set of
$k$ distinct colors $B \subseteq C$ and the entry
$\mathcal{D}(S, V, i - 1, C \setminus B) = 1$. Otherwise, we set
$\mathcal{D}(S, T, i, C) = 0$.

To determine if $G_{\mathcal{A}}$ contains a colorful path of given length
$i \le t$ from $S$ to $T$, we simply check whether
$\mathcal{D}(S, T, i, C) = 1$ for some set of colors $C$. Similarly, we can
use our dynamic program to find colorful cycles of length $j$ that include
some given vertex $U$ by consulting $\mathcal{D}(U, U, j, C)$ for each set of
colors $C$. The actual path or cycle can then be found by backtracking through
the table $\mathcal{D}$. We note that while the cycles and paths found by this
procedure are not necessarily simple, they are edge-disjoint.
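
The dynamic program can be sketched in Python as follows. This is an illustrative variant, not the paper's exact table: it stores only the reachable triples `(S, T, C)` for each length instead of a dense 0/1 table, extends paths forward from their endpoint (which is equivalent to the recurrence over the last edge), and uses the hypothetical name `colorful_paths`. Edges are assumed to be given as `(u, v, colors)` triples.

```python
def colorful_paths(num_vertices, edges, k, t):
    """reach[i] holds every (S, T, C) such that some colorful walk with i
    edges joins S and T and uses exactly the colors in the frozenset C."""
    # incident[v] lists (neighbor, colors) for every edge touching v.
    incident = [[] for _ in range(num_vertices)]
    for u, v, colors in edges:
        colors = frozenset(colors)
        if len(colors) != k:
            continue  # an edge must carry k distinct colors to be usable
        incident[v].append((u, colors))
        if u != v:    # a loop is stored only once
            incident[u].append((v, colors))

    reach = [set() for _ in range(t + 1)]
    # Base case: the empty path from each vertex to itself uses no colors.
    reach[0] = {(s, s, frozenset()) for s in range(num_vertices)}
    for i in range(1, t + 1):
        for s, v, used in reach[i - 1]:
            for w, colors in incident[v]:
                if not (colors & used):   # the new edge's colors are unused
                    reach[i].add((s, w, used | colors))
    return reach


# Triangle whose edges (0,1) and (2,0) share color 0: no colorful 3-cycle.
clash = colorful_paths(3, [(0, 1, {0}), (1, 2, {1}), (2, 0, {0})], k=1, t=3)
# The same triangle with three distinct colors: a colorful 3-cycle exists.
clean = colorful_paths(3, [(0, 1, {0}), (1, 2, {1}), (2, 0, {2})], k=1, t=3)
```

Because color sets are disjoint along a walk, no edge can be repeated, which is why the walks found are edge-disjoint even when they are not simple. Recovering an actual path would additionally require storing a predecessor for each triple.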

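For each value of $i$, there are at most $n^2\binom{kt}{ki}$ entries in
$\mathcal{D}(S, T, i, C)$. To compute the entries for a fixed $S$, $i$, and
$C$, we examine, for each vertex $T$, each edge $(V, T)$ incident to $T$,
check if $(V, T)$ is colored with a set of $k$ colors $B \subseteq C$, and
consult $\mathcal{D}(S, V, i - 1, C \setminus B)$. Since $G_{\mathcal{A}}$
contains only $O(n)$ edges, all of this can be accomplished in time $O(nki)$
for each of the $n\binom{kt}{ki}$ choices of $S$ and $C$. Thus, the total time
to compute $\mathcal{D}$ up to $i = t$ is of order:

\[ \sum_{i=1}^{t} n^2 ki \binom{kt}{ki} \le n^2 kt\,2^{kt}. \]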

In order to find a canonical $t$-improvement, we first compute the table
$\mathcal{D}$ up to $i = t$. Then, we search for improvements of each kind
shown in Figure 1 by enumerating over all choices of $S$ and $T$, and looking
for an appropriate collection of cycles or paths involving these vertices that
use mutually disjoint sets of colors. Specifically:

• To find improvements of the form shown in Figure 1a we enumerate over
all $n$ vertices $S$. For all disjoint sets of $ka$ and $kb$ colors $C_a$ and
$C_b$, with $a + b \le t$, we check if $\mathcal{D}(S, S, a, C_a) = 1$ and
$\mathcal{D}(S, S, b, C_b) = 1$. This can be accomplished in time

\[ n \sum_{a=1}^{t} \sum_{b=1}^{t-a} 2^{ka}\,2^{kb}\,kt = O(nkt^3 2^{kt}). \]

• To find improvements of the form shown in Figure 1b we enumerate over
all distinct vertices $S$ and $T$. For all disjoint sets of $ka$, $kb$, and
$kc$ colors $C_a$, $C_b$, and $C_c$ with $a + b + c \le t$, we check if
$\mathcal{D}(S, S, a, C_a) = 1$, $\mathcal{D}(T, T, b, C_b) = 1$, and
$\mathcal{D}(S, T, c, C_c) = 1$. This can be accomplished in time

\[ n^2 \sum_{a=1}^{t} \sum_{b=1}^{t-a} \sum_{c=1}^{t-a-b} 2^{ka}\,2^{kb}\,2^{kc}\,kt = O(n^2 kt^4 2^{kt}). \]



• To find improvements of the form shown in Figure 1c we again enumerate
over all distinct vertices $S$ and $T$. For all disjoint sets of $ka$, $kb$,
and $kc$ colors $C_a$, $C_b$, and $C_c$ with $a + b + c \le t$, we check if
$\mathcal{D}(S, T, a, C_a) = 1$, $\mathcal{D}(S, T, b, C_b) = 1$, and
$\mathcal{D}(S, T, c, C_c) = 1$. This can be accomplished in time

\[ n^2 \sum_{a=1}^{t} \sum_{b=1}^{t-a} \sum_{c=1}^{t-a-b} 2^{ka}\,2^{kb}\,2^{kc}\,kt = O(n^2 kt^4 2^{kt}). \]

Thus, the total time spent searching for a canonical $t$-improvement is
at most:

\[ O(n^2 kt\,2^{kt} + nkt^3 2^{kt} + 2n^2 kt^4 2^{kt}) = O(n^2 kt^4 2^{kt}) = O(2^k k \cdot n^{4k+2} \log^4 n). \]

5 The Deterministic, Large Neighborhood Local Search Algorithm

The analysis of the local search algorithm in Section 3 supposed that every
call to Improve($G_{\mathcal{A}}$) returns $\emptyset$ only when no canonical
$t$-improvement exists in $G_{\mathcal{A}}$. Under this assumption, the
algorithm is a $\frac{k+2}{3}$-approximation. In contrast, the dynamic
programming implementation given in Section 4 may fail to find a canonical
improvement $I$ if the chosen random coloring does not make $I$ colorful. As
we have shown in (4), this can happen with probability at most
$(1 - e^{-k} n^{-8k})$.

Suppose that we implement each call to Improve($G_{\mathcal{A}}$) by running
the algorithm of Section 4 $cN = c e^{k} n^{8k} \ln n$ times, each with a
different random coloring. We now show that the resulting algorithm is a
polynomial time $\frac{k+2}{3}$-approximation with high probability
$1 - n^{1-c}$.

We note that each improvement found by our local search algorithm must
increase the size of the packing $\mathcal{A}$, and so the algorithm makes at
most $n$ calls to Improve($G_{\mathcal{A}}$). We set
$N = e^{k} n^{8k} \ln n$, and then implement each such call by repeating the
color coding algorithm of Section 4 $cN$ times for some $c > 1$, each with a
new random coloring. The probability that any given call to
Improve($G_{\mathcal{A}}$) succeeds in finding a canonical $t$-improvement
when one exists is then at least:

\[ 1 - (1 - e^{-k} n^{-8k})^{cN} \ge 1 - \exp\{-e^{-k} n^{-8k} \cdot c e^{k} n^{8k} \ln n\} = 1 - n^{-c}. \]

And so, the probability that all calls to Improve($G_{\mathcal{A}}$) satisfy
the assumptions of Theorem 3.1 is at least:

\[ (1 - n^{-c})^{n} \ge 1 - n^{1-c}. \]
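
The two amplification inequalities can be checked numerically for tiny parameters. The Python sketch below is a sanity check only; the function name `failure_probability` and the sample values $k = 1$, $n = 2$, $c = 2$ are assumptions chosen so that the quantities fit comfortably in floating point.

```python
from math import exp, log


def failure_probability(k, n, c):
    """Probability that cN independent colorings all miss a fixed
    improvement, using the per-coloring success bound e^{-k} n^{-8k}
    and N = e^k n^{8k} ln n."""
    p = exp(-k) * n ** (-8 * k)
    trials = c * exp(k) * n ** (8 * k) * log(n)
    return (1 - p) ** trials


k, n, c = 1, 2, 2
miss = failure_probability(k, n, c)
per_call_bound = n ** (-c)               # a single call fails w.p. <= n^{-c}
all_calls_ok = (1 - per_call_bound) ** n  # all n calls succeed
union_style_bound = 1 - n ** (1 - c)
```

With these values a single call fails with probability just under $n^{-c} = 1/4$, and all $n = 2$ calls succeed with probability $0.5625 \ge 1 - n^{1-c} = 0.5$.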

The resulting algorithm is therefore a $\frac{k+2}{3}$-approximation with high
probability. It requires at most $n$ calls to Improve($G_{\mathcal{A}}$), each
requiring total time

\[ O(cN \cdot 2^k k n^{4k+2} \log^4 n) = O(c(2e)^k k n^{12k+2} \log^4 n \ln n) = c n^{O(k)}. \]

Using the general approach described by Alon, Yuster, and Zwick [1], we
can in fact give a deterministic implementation of Improve($G_{\mathcal{A}}$),
which always succeeds in finding a canonical $t$-improvement in
$G_{\mathcal{A}}$ if such an improvement exists. Rather than choosing a
coloring of the ground set $X$ at random, we use a collection $\mathcal{K}$ of
colorings (each of which is given as a mapping $X \to K$) with the property
that every canonical $t$-improvement in $G_{\mathcal{A}}$ is colorful with
respect to some coloring in $\mathcal{K}$. For this, it is sufficient to find
a collection $\mathcal{K}$ of colorings such that for every set of at most
$kt$ elements in $X$, there is some coloring in $\mathcal{K}$ that assigns
these $kt$ elements $kt$ distinct colors from $K$. Then, we implement
Improve($G_{\mathcal{A}}$) by running the dynamic programming algorithm of
Section 4 on each such coloring, and returning the first improvement found.
Because every canonical $t$-improvement contains at most $kt$ distinct
elements of the ground set, every such improvement must be made colorful by
some coloring in $\mathcal{K}$, and so Improve($G_{\mathcal{A}}$) will always
find a canonical $t$-improvement if one exists.

We now show how to construct the desired collection of colorings
$\mathcal{K}$ by using a $kt$-perfect family of hash functions from $X \to K$.
Briefly, a perfect hash function for a set $S \subseteq A$ is a mapping from
$A$ to $B$ that is one-to-one on $S$. A $p$-perfect family is then a
collection of perfect hash functions, one for each set $S \subseteq A$ of size
at most $p$. Building on work by Fredman, Komlós and Szemerédi [10] and
Schmidt and Siegel [9], Alon and Naor show (in Theorem 3 of [11]) how to
explicitly construct a perfect hash function from $[m]$ to $[p]$ for some
$S \subseteq [m]$ of size $p$ in time $O(p \log m)$. This hash function is
described in $O(p + \log p \cdot \log\log m)$ bits. The maximum size of a
$p$-perfect family of such functions is therefore
$2^{O(p + \log p \cdot \log\log m)}$. Moreover, the function can be evaluated
in time $O(\log m / \log p)$.

Then, we can obtain a deterministic, polynomial time
$\frac{k+2}{3}$-approximation as follows. Upon receiving the set packing
instance $\mathcal{N}$ with ground set $X$, we compute a $kt$-perfect family
$\mathcal{K}$ of hash functions from $X$ to a set of $kt$ colors $K$. Then, we
implement each call to Improve($G_{\mathcal{A}}$) as described, by enumerating
over the colorings in $\mathcal{K}$. We note that since each set in
$\mathcal{N}$ has size $k$, $|X| \le |\mathcal{N}|k = nk$, so each call to
Improve($G_{\mathcal{A}}$) makes at most

\[ 2^{O(kt + \log kt \cdot \log\log kn)} = 2^{O(k\log n + \log(k\log n) \cdot \log\log kn)} = n^{O(k)} \]

calls to the dynamic programming algorithm of Section 4 (one per coloring in
$\mathcal{K}$) and each of these calls takes time at most $n^{O(k)}$
(including the time to evaluate the coloring on each element of the ground
set). Moreover, the initial construction of $\mathcal{K}$ takes time at most
$2^{kt}\,O(kt \log kn) = n^{O(k)}$.

6 A Lower Bound 

We now show that our analysis is almost tight. Specifically, we show that the
locality gap of $t$-local search is at least $\frac{k}{3}$, even when $t$ is
allowed to grow on the order of $n$.

Theorem 6.1. Let $c = \frac{9}{2e^5 k}$ and suppose that $t \le cn$ for all
sufficiently large $n$. Then, there exist 2 pairwise disjoint collections of
$k$-sets $\mathcal{S}$ and $\mathcal{O}$ with $|\mathcal{S}| = 3n$ and
$|\mathcal{O}| = kn$ such that any collection of $a \le t$ sets in
$\mathcal{O}$ conflicts with at least $a$ sets in $\mathcal{S}$.

In order to prove Theorem 6.1 we make use of the following (random)
construction. Let $X$ be a ground set of $3kn$ elements, and consider a
collection $\mathcal{S}$ of $3n$ sets, each containing $k$ distinct elements
of $X$. We construct a random collection $\mathcal{R}$ of $kn$ disjoint
subsets of $X$, each containing 3 elements.

The number of distinct collections generated by this procedure is equal to
the number of ways to partition the $3kn$ elements of $X$ into $kn$ disjoint
3-sets. We define the quantity $\tau(m)$ to be the number of ways that $m$
elements can be divided into $m/3$ disjoint 3-sets:

\[ \tau(m) = \frac{m!}{(3!)^{m/3}\,(m/3)!}. \]

In order to verify the above formula, consider the following procedure for
generating a random partition. We first arrange the $m$ elements in some order
and then make a 3-set from the elements at positions $3i$, $3i - 1$ and
$3i - 2$, for each $i \in [m/3]$ (that is, we group each set of 3 consecutive
elements in the ordering into a triple). Now, we note that two permutations of
the elements produce the same partition if they differ only in the ordering of
the 3 elements within each of the $m/3$ triples or in the ordering of the
$m/3$ triples themselves. Thus, each partition occurs in exactly
$(3!)^{m/3}(m/3)!$ of the $m!$ possible orderings.
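
The counting argument can be confirmed by brute force for small $m$. The Python sketch below enumerates all orderings exactly as in the procedure described above; the function names `tau` and `count_partitions_brute_force` are illustrative, and the enumeration is only feasible for very small $m$.

```python
from itertools import permutations
from math import factorial


def tau(m):
    """Closed form for the number of partitions of m elements into
    m/3 unordered triples (m must be divisible by 3)."""
    return factorial(m) // (factorial(3) ** (m // 3) * factorial(m // 3))


def count_partitions_brute_force(m):
    """Group every ordering of range(m) into consecutive triples and count
    the distinct partitions that result."""
    partitions = set()
    for order in permutations(range(m)):
        triples = frozenset(frozenset(order[j:j + 3]) for j in range(0, m, 3))
        partitions.add(triples)
    return len(partitions)


small = count_partitions_brute_force(3)
medium = count_partitions_brute_force(6)
```

For $m = 6$ the 720 orderings collapse to 10 distinct partitions, in agreement with $6!/((3!)^2 \cdot 2!)$.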

The probability that any particular collection of $a$ disjoint 3-sets occurs
in $\mathcal{R}$ is given by

\[ p(a) \triangleq \frac{\tau(3kn - 3a)}{\tau(3kn)}. \]

This is simply the number of ways to partition the remaining $3kn - 3a$
elements into $kn - a$ disjoint 3-sets, divided by the total number of
possible partitions of all $3kn$ elements.

We say that a collection $\mathcal{A}$ of $a$ sets in $\mathcal{S}$ is
unstable if there is some collection $\mathcal{B}$ of at least $a$ sets in
$\mathcal{R}$ that conflict with only those sets in $\mathcal{A}$. Note that
there is an improvement of size $a$ for $\mathcal{S}$ only if there is some
unstable collection $\mathcal{A}$ of size $a$ (see footnote 3). We now derive
an upper bound on the probability that our random construction of
$\mathcal{R}$ results in a given collection $\mathcal{A}$ in $\mathcal{S}$
being unstable.

Lemma 6.2. A collection of $a$ sets in $\mathcal{S}$ is unstable with
probability less than

\[ \frac{\binom{ka}{3a}\binom{kn}{a}}{\binom{3kn}{3a}}. \]

Proof. A collection $\mathcal{A}$ of $a$ $k$-sets from $\mathcal{S}$ is
unstable precisely when there is a collection $\mathcal{B}$ of $a$ 3-sets in
$\mathcal{R}$ that contain only those $ka$ elements appearing in the sets of
$\mathcal{A}$. There are $\binom{ka}{3a}$ ways to choose the $3a$ elements
from which we construct $\mathcal{B}$. For each such choice, there are
$\tau(3a)$ possible ways to partition the elements into 3-sets, each occurring
with probability $p(a) = \tau(3kn - 3a)/\tau(3kn)$. Applying the union bound,
the probability that $\mathcal{A}$ is unstable is then at most:

\[
\binom{ka}{3a}\,\tau(3a)\,\frac{\tau(3kn - 3a)}{\tau(3kn)}
= \binom{ka}{3a} \cdot \frac{(3a)!}{(3!)^{a}\,a!} \cdot \frac{(3(kn - a))!}{(3!)^{kn-a}\,(kn - a)!} \cdot \frac{(3!)^{kn}\,(kn)!}{(3kn)!}
= \binom{ka}{3a} \cdot \frac{(3a)!\,(3(kn - a))!}{(3kn)!} \cdot \frac{(kn)!}{(kn - a)!\,a!}
= \frac{\binom{ka}{3a}\binom{kn}{a}}{\binom{3kn}{3a}}.
\]

□

3 In fact, for an improvement to exist, there must be some collection
$\mathcal{B}$ of $a + 1$ such sets in $\mathcal{R}$. This stronger condition
is unnecessary for our bound, however.
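
The algebraic simplification in the proof of Lemma 6.2 can be verified with exact rational arithmetic. The Python sketch below compares the union-bound expression with the closed form for a few small parameter choices; the function names are hypothetical and the parameter values are arbitrary test cases, not values from the paper.

```python
from fractions import Fraction
from math import comb, factorial


def tau(m):
    """Number of partitions of m elements into m/3 unordered triples."""
    return factorial(m) // (6 ** (m // 3) * factorial(m // 3))


def union_bound(k, n, a):
    """C(ka, 3a) * tau(3a) * tau(3kn - 3a) / tau(3kn), computed exactly."""
    return Fraction(comb(k * a, 3 * a) * tau(3 * a) * tau(3 * k * n - 3 * a),
                    tau(3 * k * n))


def closed_form(k, n, a):
    """C(ka, 3a) * C(kn, a) / C(3kn, 3a), computed exactly."""
    return Fraction(comb(k * a, 3 * a) * comb(k * n, a),
                    comb(3 * k * n, 3 * a))


cases = [(3, 2, 1), (4, 3, 2), (5, 4, 3), (6, 2, 2)]
agree = all(union_bound(k, n, a) == closed_form(k, n, a) for k, n, a in cases)
```

Using `Fraction` avoids any floating-point rounding, so equality here reflects the exact identity rather than a numerical coincidence.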

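Proof of Theorem 6.1. Let $U_a$ be the number of unstable collections of size
$a$ in $\mathcal{S}$, and consider $E[U_a]$. There are precisely
$\binom{3n}{a}$ such collections, and from Lemma 6.2 each occurs with
probability less than $\binom{ka}{3a}\binom{kn}{a} / \binom{3kn}{3a}$. Thus:

\[ E[U_a] < \frac{\binom{3n}{a}\binom{ka}{3a}\binom{kn}{a}}{\binom{3kn}{3a}}. \tag{5} \]

Applying the upper and lower bounds

\[ \left(\frac{n}{k}\right)^{k} \le \binom{n}{k} \le \left(\frac{en}{k}\right)^{k} \]

in the numerator and denominator, respectively, of (5), we obtain the upper
bound

\[ E[U_a] < \frac{(e3n)^{a}\,(eka)^{3a}\,(ekn)^{a}\,(3a)^{3a}}{a^{a}\,(3a)^{3a}\,a^{a}\,(3kn)^{3a}} = \left(\frac{e^5 3^4 k^4 a^6 n^2}{3^6 k^3 a^5 n^3}\right)^{a} = \left(\frac{e^5 ka}{9n}\right)^{a}. \]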

Then, the expected number of unstable collections in $\mathcal{S}$ of size at
most $t$ (and hence the expected number of $t$-improvements for
$\mathcal{S}$) is less than

\[ \sum_{a=1}^{t} E[U_a] < \sum_{a=1}^{t} \left(\frac{e^5 ka}{9n}\right)^{a}. \tag{6} \]

For all sufficiently large $n$, we have $a \le t \le cn$ and so

\[ \sum_{a=1}^{t} \left(\frac{e^5 ka}{9n}\right)^{a} \le \sum_{a=1}^{t} \left(\frac{e^5 kcn}{9n}\right)^{a} = \sum_{a=1}^{t} \left(\frac{1}{2}\right)^{a} < \sum_{a=1}^{\infty} \left(\frac{1}{2}\right)^{a} = 1. \]

Thus, there must exist some collection $\mathcal{O}$ in the support of
$\mathcal{R}$ that creates no unstable collections of size at most $t$ in
$\mathcal{S}$. Then, $\mathcal{O}$ is a collection of pairwise disjoint sets
of size $kn$ satisfying the conditions of the theorem. □

7 Conclusion



We have given a polynomial time $\frac{k+2}{3}$-approximation algorithm for
the problem of $k$-set packing. Our algorithm is based on a simple local
search algorithm, but incorporates ideas from fixed parameter tractability to
search large neighborhoods efficiently, allowing us to achieve an
approximation guarantee exceeding the $k/2$ bound of Hurkens and Schrijver
[7]. In contrast, our lower bound of $k/3$ shows that local search algorithms
considering still larger neighborhoods, including neighborhoods of exponential
size, can yield only slight improvements.

An interesting direction for future research would be to close the gap
between our $k/3$ lower bound and upper bound. Recently, Cygan, Grandoni,
and Mastrolilli [3] have given a quasi-polynomial time local search algorithm
attaining an approximation ratio of $(k + 1)/3$. Their analysis is also based
on that of Berman and Fürer [2] and Halldórsson [5], but their algorithm
requires searching for improvements with a more general structure than those
that we consider, and it is unclear how to apply techniques similar to ours in
this case. Nevertheless, we conjecture that it is possible to attain an
approximation ratio of $(k + 1)/3$ in polynomial time, although this will
likely require more sophisticated techniques than we consider here.

In contrast to all known positive results, the best known NP-hardness result
for $k$-set packing, due to Hazan, Safra, and Schwartz [6], is only
$O(k/\ln k)$. A more general open problem is whether the gap between this
result and algorithmic results can be narrowed.

Finally, we ask whether our results can be generalized to the independent set
problem in $(k + 1)$-claw free graphs. Most known algorithms for $k$-set
packing, including those given by Halldórsson [5] and Cygan, Grandoni, and
Mastrolilli [3], generalize trivially to this setting. However, this does not
seem to be the case for the color coding approach that we employ, as it relies
on the set packing representation of problem instances.

Acknowledgements 

We would like to thank Oleg Pikhurko for extremely enlightening discussion on 
random graphs. 

References 

[1] N. Alon, R. Yuster, and U. Zwick, Color-coding, Journal of the ACM, 42(4), pp. 
844-856 (1995) 

[2] P. Berman and M. Fürer, Approximating maximum independent set in bounded
degree graphs, SODA (1994)

[3] M. Cygan, F. Grandoni, and M. Mastrolilli, How to sell hyperedges: the hyper- 
matching assignment problem, SODA (2013) 

[4] M. Fellows, C. Knauer, and N. Nishimura, Faster fixed-parameter tractable algo- 
rithms for matching and packing problems, Algorithmica, pp. 167-176 (2004) 






[5] M. Halldorsson, Approximating Discrete Collections via Local Improvements, 
SODA, (1995) 

[6] E. Hazan, S. Safra, and O. Schwartz, On the complexity of approximating
k-set packing. Computational Complexity, 15(1), pp. 20-39 (2006)

[7] C. Hurkens and A. Schrijver, On the Size of Systems of Sets Every t of Which Have 
an SDR, with an Application to the Worst-Case Ratio of Heuristics for Packing 
Problems, SIAM Journal of Discrete Mathematics 2(1), pp. 68-72 (1989) 

[8] V. Paschos, A survey of approximately optimal solutions to some covering and 
packing problems, ACM Computing Surveys 29(2), pp. 171 - 209 (1997) 

[9] J. P. Schmidt and A. Siegel, The spatial complexity of oblivious k-probe hash func- 
tions. SIAM Journal on Computing, 19(5), pp. 775-786 (1990) 

[10] M. Fredman, J. Komlós, and E. Szemerédi, Storing a sparse table with O(1)
worst case access time. Journal of the ACM, 31(3), pp. 538-544 (1984)

[11] N. Alon and M. Naor, Derandomization, witnesses for boolean matrix multiplica- 
tion and construction of perfect hash functions. Algorithmica, 16(4/5), pp. 434-449 
(1996) 

A Appendix 

Here we provide detailed proofs of the cited technical results. 

Lemma 3.2 (Lemma 3.2 in [2]). Assume that $|E| \ge \frac{p+1}{p}|V|$ in a
multigraph $H = (V, E)$. Then, $H$ contains a canonical improvement with at
most $4p\log n - 1$ vertices.

Proof. Suppose that $H'$ is the smallest induced subgraph of $H$ that
satisfies the condition

\[ |E(H')| \ge \frac{p+1}{p}|V(H')|. \tag{7} \]

We shall show that $H'$ must contain a canonical improvement. First, we note
that $H'$ cannot contain any degree 1 vertices. Otherwise, we could remove all
such vertices to obtain a smaller graph satisfying (7). Moreover, any chain of
degree 2 vertices in $H'$ has fewer than $p$ vertices. Otherwise, we could
remove this chain of $p$ vertices, together with the $p + 1$ edges incident on
them, to obtain a smaller graph satisfying (7). We replace every chain of
degree 2 vertices in $H'$ with a single edge connecting its endpoints to
obtain a graph $H_3$ with minimum degree 3.

Let $I_3$ be a minimal connected subgraph of $H_3$ with exactly 2 distinct
cycles. Then, $I_3$ is a canonical improvement in $H_3$ and
$|V(I_3)| < |E(I_3)| + 1$. By expanding each contracted chain of vertices in
$I_3$, we obtain a connected subgraph $I'$ of $H'$ that contains 2 distinct
cycles. Then, $|V(I')| \le p(|V(I_3)| + 1)$. Thus, to complete the proof of
Lemma 3.2 it suffices to show that $H_3$ must contain a connected subgraph
$I_3$ with exactly 2 distinct cycles and $|V(I_3)| \le 4\log n - 1$.

Let $n = |V|$ and note that $|V(H_3)| \le |V(H')| \le |V(H)| \le n$. We first
note that $H_3$ has girth less than $2\log n$. To prove this, simply construct
a breadth first search tree rooted at some vertex in the graph. There must be
some vertex $v$ of distance less than $\log n$ from the root without 2
children. Since $v$ has degree 3, it must have an edge to some previously
visited vertex $u$ in the tree (where possibly $u = v$). The paths from $u$
and $v$ to the root, together with the edge $(u, v)$, form a connected
subgraph $C$ that has at most $2\log n$ vertices and contains both the root of
the tree and a cycle. Let $C$ be a minimal such subgraph. We contract $C$ to a
single vertex and call the resulting graph $H_3'$. Then, $H_3'$ must also have
minimum degree 3 and at most $n$ vertices. Repeating the argument we can find
a minimal subgraph $C'$ in $H_3'$ with at most $2\log n$ vertices that
contains both the root of $H_3'$ and a cycle. Let $I_3$ be the induced
subgraph of $H_3$ containing the vertices of $C$ and $C'$. Then, $I_3$ has at
most $4\log n - 1$ vertices, and is a connected subgraph of $H_3$ containing
exactly 2 distinct cycles. □

Lemma 3.3. Let $H = (V, E)$ be a multigraph and let $H' = (V', E')$ be
obtained from $H$ by deleting all vertices of $H$ with loops on them and, for
each edge with one endpoint incident to a deleted vertex, introducing a new
loop on this edge's remaining vertex. Let $t \ge 3$. Then, if $H'$ contains a
canonical $t$-improvement, $H$ contains a canonical $(t + 2)$-improvement.

Proof. Our argument is based on a sketch given by Halldórsson [5]. In the
interest of completeness, we present a more detailed argument here.

Note that any canonical $t$-improvement in $H'$ that does not contain a loop
is also present in $H$. Consider, then, a canonical $t$-improvement $I$ in
$H'$ which contains either one or two loops (i.e. an improvement of the form
shown in Figures 1a and 1b where one or both of the cycles are loops). A loop
on vertex $v$ in $H'$ corresponds to an edge $(u, v)$ in $H$, where $u$ has a
loop. The improvement $I$ in $H'$ must have 2 cycles joined by either a path
or a single vertex. Figure 3 illustrates all of the possible configurations
for $I$ in $H'$ and the related canonical improvements in $H$, which we now
show must exist.

If exactly one of these cycles is a loop on some vertex $v$, then we must have
a path (possibly of length 0) joining $v$ to another cycle $J$ in $H'$. This
path and $J$ are also present in $H$. Additionally, in $H$ we must have an
edge $(u, v)$ and a loop on $u$. Thus, the loop on $u$, together with the edge
$(u, v)$, the cycle $J$ and the path connecting $v$ to $J$ form a canonical
improvement with only one more edge than $I$.

Now, suppose that both of these cycles are loops on some vertices $v_1$ and
$v_2$ (where possibly $v_1 = v_2$). If the corresponding edges $(v_1, u_1)$
and $(v_2, u_2)$ in $H$ have distinct endpoints $u_1 \ne u_2$, then the two
loops on $u_1$ and $u_2$ together with these edges and the path (of length 0,
in the case that $v_1 = v_2$) joining $v_1$ and $v_2$ form a canonical
improvement. If $u_1 = u_2$, then the edges $(v_1, u_1)$ and $(v_2, u_2)$,
together with the path (again, of length 0 if $v_1 = v_2$) from $v_1$ to
$v_2$, form a cycle. This, together with the loop on the vertex $u_1 = u_2$,
forms a canonical improvement. In both cases, the canonical improvement in $H$
has only 2 more edges than $I$. □



[Figure 3; recoverable panel title: "Improvement in H'"]